Optimal Tabular Releases from Confidential Data
Authors
Abstract
We describe and illustrate NISS-developed optimal tabular release technology, which releases sets of sub-tables of large contingency tables that maximize data utility (in our examples, the number of sub-tables released) subject to a constraint on disclosure risk (tightness of bounds on small-count, risky cells in the underlying table). This approach explicitly accommodates the mandate of Federal statistical agencies to protect data confidentiality and their mission to disseminate information derived from the data.
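The trade-off described in the abstract can be illustrated with a toy sketch. It is not the NISS technology itself, which targets large, high-dimensional tables. It assumes a two-way table whose grand total is public, treats the row and column marginals as the only candidate sub-tables, computes Fréchet bounds on each cell, and searches for the largest release that keeps the bounds on small-count cells wide enough. The function names and the thresholds `small` and `min_width` are hypothetical choices made for illustration.

```python
from itertools import combinations


def cell_bounds(table, released):
    """Bounds on each cell of a two-way table given the released marginals.

    `released` is a subset of {"rows", "cols"}. Assumption: the grand
    total is public. With both marginals released these are the
    Frechet bounds.
    """
    n = sum(map(sum, table))
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    bounds = {}
    for i, row in enumerate(table):
        for j, _ in enumerate(row):
            lower, upper = 0, n
            if "rows" in released:
                upper = min(upper, row_tot[i])
            if "cols" in released:
                upper = min(upper, col_tot[j])
            if "rows" in released and "cols" in released:
                lower = max(0, row_tot[i] + col_tot[j] - n)
            bounds[(i, j)] = (lower, upper)
    return bounds


def is_safe(table, released, small=2, min_width=3):
    """Risk rule (hypothetical): every cell with a count in 1..small
    must have bounds at least `min_width` apart."""
    bounds = cell_bounds(table, released)
    return all(
        hi - lo >= min_width
        for (i, j), (lo, hi) in bounds.items()
        if 0 < table[i][j] <= small
    )


def optimal_release(table, candidates=("rows", "cols"), **risk_params):
    """Largest set of candidate marginals that passes the risk rule.

    Brute-force search from the largest subsets down; utility is
    simply the number of sub-tables released.
    """
    for k in range(len(candidates), -1, -1):
        for subset in combinations(candidates, k):
            if is_safe(table, set(subset), **risk_params):
                return set(subset)
    return None


if __name__ == "__main__":
    table = [[1, 2], [20, 30]]
    print(optimal_release(table, min_width=4))
```

For this table, releasing the row marginal would pin both small cells in the first row to the interval [0, 3], so under the stricter `min_width=4` only the column marginal can be released. Exhaustive search is only feasible for a handful of candidates; realistic tables need more careful bound computation and search strategies.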
Similar resources
Partial Information Releases for Confidential Contingency Table Entries: Present and Future Research Efforts
Tabular data have been a staple product for disseminating information derived from the confidential microdata that fuel social science research and inform policy decisions. This paper outlines recent results on disclosure risk assessment associated with the release of high-dimensional contingency tables, and discusses some related research problems. The main focus is the partial information rel...
Recent advances in optimization techniques for statistical tabular data protection
One of the main services of National Statistical Agencies (NSAs) for the current Information Society is the dissemination of large amounts of tabular data, which is obtained from microdata by crossing one or more categorical variables. NSAs must guarantee that no confidential individual information can be obtained from the released tabular data. Several statistical disclosure control methods ar...
Software Systems for Tabular Data Releases
We describe two classes of software systems that release tabular summaries of an underlying database. Table servers respond to user queries for (marginal) sub-tables of the “full” table summarizing the entire database, and are characterized by dynamic assessment of disclosure risk, in light of previously answered queries. Optimal tabular releases are static releases of sets of sub-tables that a...
Software for tabular data protection.
In order for national statistical offices to maintain the trust of the public to collect data and publish statistics of importance to society and decision-making, it is imperative that respondents (persons or establishments) be guaranteed privacy and confidentiality in return for providing requested confidential data. Consequently, for most survey and census data, disclosure limitation techniqu...
Perspective Reformulations of the CTA Problem with L2 Distances
Any institution that disseminates data in aggregated form has the duty to ensure that individual confidential information is not disclosed, either by not releasing data or by perturbing the released data, while maintaining data utility. Controlled tabular adjustment (CTA) is a promising technique of the second type where a protected table that is close to the original one in some chosen distanc...